Xi Yin

Human detectors are surprisingly powerful reward models

Jan 21, 2026

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning

Mar 27, 2025

Generating Multi-Image Synthetic Data for Text-to-Image Customization

Feb 03, 2025

MotiF: Making Text Count in Image Animation with Motion Focal Loss

Dec 20, 2024

Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Dec 19, 2024

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024

Proactive Schemes: A Survey of Adversarial Attacks for Social Good

Sep 24, 2024

AcademicGPT: Empowering Academic Research

Nov 21, 2023

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Nov 17, 2023